Digging for diamonds: Identifying valuable end-user code in repositories

Authors

  • Jarrod Jackson
  • Kathryn Stolee
  • Christopher Scaffidi
Abstract

To a large extent, repositories of end-user code are “write-only”: much of the code that people publish never sees substantial reuse. Yet buried within these repositories are valuable pieces of code, though finding them is not always easy. In prior work, we developed a model that can predict, when a web macro is created, whether that script will be reused by anybody. In the current paper, we analyze data from two other end-user repositories to investigate the model’s generalizability to other kinds of code. We find that the model performs well for a wide range of different purposes and configurations, including predicting future reuse events based on data about past events, indicating that the model could serve as an effective basis for future repository enhancements.

Similar resources

Using traits of web macro scripts to predict reuse

To help people find code that they might want to reuse, repositories of end-user code typically sort scripts by number of downloads, ratings, or other information based on prior uses of the code. However, this information is unavailable when code is new or when it has not yet been reused. Addressing this problem requires identifying reusable code based solely on information that exists when a s...

Evaluation of Digital Repositories from an End-users' Perspective: The Case of the reUSE project

Along with the long-term preservation of digital publications, the next important goal is public access and, in this regard, user-centred design of digital repositories. Several repositories worldwide have shown that users are their most critical element. Repositories as such are valuable only if used. The success of the repository is often influenced by the content and the design of the interface...

Integrating S6 code search and Code Bubbles

We wanted to provide a tool for doing code search over open source repositories as part of the Code Bubbles integrated development environment. Integrating code search as a plug-in to Code Bubbles required substantial changes to the S6 code search engine and the development of appropriate user interfaces in Code Bubbles. After briefly reviewing Code Bubbles and the S6 search engine, this paper ...

Towards Knowledge Discovery in Software Repositories to Support Refactoring

Software repositories are typically used to store code together with additional information. These repositories are a valuable source to train knowledge discovery algorithms to detect code smells and other qualitative defects. In this paper we present a lightweight framework to detect previously unknown knowledge from software repositories to support refactoring. The results will be usable by ...

Publication date: 2010